Thesaurus Expansion using Similar Wor
Author
Abstract
In both written and spoken language, we sometimes use different words to express the same meaning. For instance, “constraint” (seigen) and “restriction” (seiyaku) are used with the same meaning. This makes text classification and text summarization difficult. To deal with this problem, dictionaries, especially thesauri, are used. However, technical papers and patent documents contain many new words that are not listed in dictionaries. In this paper, we propose a method for accurately extracting words that are semantically similar to each other. Using this method, we extracted similar word pairs from patent documents. We also expanded a thesaurus using the extracted similar words.
Similar resources
English-Japanese Cross-lingual Query Expansion Using Random Indexing of Aligned Bilingual Text Data
Vector space models can be used for extracting semantically similar words from the co-occurrence statistics of words in large text data. In this paper, we report on our NTCIR 2002 experiments using the Random Indexing vector space method for extracting an English-Japanese cross-lingual thesaurus from aligned English-Japanese bilingual data. The crosslingual thesaurus has been used for automatic...
The Exploration and Analysis of Using Multiple Thesaurus Types for Query Expansion in Information Retrieval
This paper proposes the use of multiple thesaurus types for query expansion in information retrieval. A hand-crafted thesaurus, a corpus-based co-occurrence thesaurus, and a syntactic-relation-based thesaurus are combined and used as a tool for query expansion. Simple word sense disambiguation is performed to avoid misleading expansion terms. Experiments using the TREC-7 collection proved that thi...
Query Expansion using an Automatically Constructed Thesaurus
Our group participated in the Japanese and English Retrieval Subtasks of NTCIR-6. Our goal was to evaluate the effectiveness of a thesaurus constructed from patents for invalidity search. To confirm the effectiveness of our thesaurus-based query expansion, we conducted experiments and found that our method can improve upon traditional document retrieval systems.
Assessing the Impact of Thesaurus-Based Expansion Techniques in QA-Centric IR
In this paper, we assess the impact of using thesaurus-based query expansion methods at the Information Retrieval (IR) stage of a Question Answering (QA) system. We focus on expanding queries for questions regarding actions and events, where verbs have particularly important roles. Two different thesauri are used: the OpenOffice thesaurus and an automatically generated verb thesaurus. The per...
Publication date: 2006